On building predictive models with company annual reports
Authors
Abstract
Text mining and machine learning methods have been applied in the biomedical and business domains to discover new relationships and knowledge. Company annual reports (or 10K filings), among the most important mandatory information disclosures, have remained untapped by the text mining and machine learning community. Previous research indicates that the narrative disclosures in company annual reports can be used to assess a company's short-term financial prospects. In this study, we apply text classification methods to 10K filings to systematically assess the predictive potential of company annual reports. We specify our research problem along five dimensions: financial performance indicators, choice of predictions, evaluation criteria, document representation, and experiment design. Different combinations of the choices made along these five dimensions give different perspectives on, and insights into, the feasibility of using annual reports to predict a company's future performance. Our results confirm that predictive models can be built successfully from the textual content of annual reports. Mock portfolios constructed from firms selected by the text-based model are shown to produce positive average stock returns. Sub-sample experiments and post-hoc analysis further confirm that the text-based model captures the textual differences among firms with different financial characteristics. We see a rich set of research questions with the promise of further insight in this research area.
Similar resources
Towards Building Ranking Models with Annual Reports
Abstract: The textual content of company annual reports has proven to contain predictive indicators for the company future performance. This paper addresses the general research question of evaluating the effectiveness of applying machine learning and text mining techniques to building predictive models with annual reports. More specifically, we focus on these two questions: 1) the feasibi...
Exploring the Forecasting Potential of Company Annual Reports
Previous research indicates that the narration disclosure in company annual reports can be used to assist in assessing the company's short-term financial prospects. However, not much effort has been made to systematically and automatically assess the predictive potential of such reports using text classification, information retrieval, and machine learning techniques. In this study, we built SV...
Optimization-Based Monitoring-Supported Calibration of a Thermal Performance Simulation Model
Building performance simulation is being increasingly deployed beyond the building design phase to support efficient building operation. Specifically, the predictive feature of the simulation-assisted building systems control strategy provides distinct advantages in view of building systems with high latency and inertia. Such advantages can be exploited only if model predictions can be relied u...
Credit Risk Predictive Ability of G-ZPP Model Versus V-ZPP Model
Credit risk management is becoming more and more important in recent years. When a company deals with a financial problem, it may not be able to fulfill its financial obligations, which can cause direct and indirect financial losses to shareholders, creditors, investors and other people in the community. Advanced credit risk models that are based on market value include improving credit quality...
Automatic Assessment of Information Disclosure Quality in Chinese Annual Reports
Information disclosure in annual reports is a mandatory requirement for publicly traded companies in China. The quality of information disclosure will reduce information asymmetry and therefore support market efficiency. Currently, the evaluation of the information disclosure quality in Chinese reports is conducted manually. It remains an untapped field for NLP and text mining community. The go...